ABSTRACT

We present results on the clustering of 282,068 galaxies in the Baryon Oscillation
Spectroscopic Survey (BOSS) sample of massive galaxies with redshifts $0.4 < z < 0.7$,
which is part of the Sloan Digital Sky Survey III project. Our results cover a large
range of scales from $\sim 500\,h^{-1}$ kpc to $\sim 90\,h^{-1}$ Mpc. We compare these estimates
with the expectations of the flat $\Lambda$CDM standard cosmological model with parameters
compatible with WMAP7 data. We use the MultiDark cosmological simulation, one
of the largest $N$-body runs presently available, together with a simple halo abundance
matching technique, to predict the galaxy correlation functions, power spectra,
abundance of satellites and galaxy biases. We find that the $\Lambda$CDM model gives a reasonable
description of the observed correlation functions at $z \sim 0.5$, which is a remarkably
good agreement considering that the model, once matched to the observed abundance
of BOSS galaxies, does not have any free parameters. However, we find a small ($\sim 10\%$)
deviation in the correlation functions for scales around 10-30 $h^{-1}$ Mpc. A more
realistic abundance matching model and better statistics from upcoming observations
are needed to clarify the situation. We also predict that about 7% of the galaxies in
the sample are most probably satellites inhabiting central haloes with mass $M > 10^{14}\,h^{-1}\,M_\odot$.
Using the MultiDark simulation we also study the scale-dependent galaxy
bias $b$ and find that $b \sim 2$ for BOSS galaxies at scales $> 10\,h^{-1}$ Mpc. The large-scale
bias, defined using the extrapolated linear matter power spectrum, depends on the
maximum circular velocity $V_{\rm max}$ of galaxies as $b = 1 + (V_{\rm max}/361\,{\rm km\,s^{-1}})^{4/3}$, or on
the galaxy number density $n_g$ as $b = 0.0377 - 0.57 \log_{10}(n_g/h^3\,{\rm Mpc}^{-3})$. The damping
of the BAO signal produced by non-linear evolution leads to $\sim 2$-$4\%$ dips in the
large-scale bias factor defined in this way. Very accurate fits as a function of abundance and
maximum circular velocity of galaxies are provided.

Key words: cosmology: large-scale structure of the Universe - cosmology: theory - 
galaxies: general - methods: observational - methods: numerical 



1 INTRODUCTION 

The clustering of galaxies is a fundamental measure of the 
statistical properties of the cosmic density field through 
cosmic time. In the last decade, it became possible to de- 
termine the clustering strength of galaxy populations at 
spatial scales out to tens of Mpc and beyond with rea- 
sonable accuracy by means of massive galaxy surveys such 
as the Two-Degree Field Galaxy Redshift Survey (e.g.,
Colless et al. 2001) and Sloan Digital Sky Survey (SDSS-I/II;
e.g., Gunn et al. 1998; York et al. 2000; Gunn et al. 2006).
These and previous studies have shown that the correlation
function is not a simple power-law and that the correlation
length of luminous and massive galaxies is larger than
that of less luminous ones (see Zehavi et al. 2011, and
references therein). Furthermore, it has also been shown that the
clustering strength of Luminous Red Galaxies (LRGs) is an
excellent tracer of the Baryon Acoustic Oscillation (BAO)
signal, which can be used to constrain the expansion history
of the Universe (e.g., Eisenstein et al. 2005).

The Baryon Oscillation Spectroscopic Survey (BOSS),
a branch of the ongoing SDSS-III (Eisenstein et al. 2011), is
considerably increasing the size of available galaxy samples.
BOSS consists of galaxy and quasar spectroscopic surveys
over a sky area of 10,000 deg$^2$ and its main goal is to measure
the BAO feature at high precision. Specifically, BOSS
aims at measuring the redshifts of about 1.5 million galaxies
out to $z = 0.7$. It will also acquire about 150,000 Ly$\alpha$ forest
spectra of quasars in the range $2.2 < z < 4$, to map the
large-scale distribution of galaxies at these earlier epochs
(see Slosar et al. 2011). The effective volume of the galaxy
survey is expected to be about 7 times higher than that of
the SDSS-I/II LRG sample, which consisted of $\sim$100,000
galaxies out to $z = 0.45$. The selection criteria of the BOSS
targets result in a sample of massive, and hence highly clustered,
systems, which are suitable candidates for a reliable
detection of the acoustic peak. Additionally, the project also
provides a wealth of other information on clustering and
physical properties of galaxies.

Requirements for theoretical predictions of galaxy clustering
in BOSS are extreme: one needs accurate predictions
for very large volumes in order to compare with observations.
Therefore, the combination of large-volume cosmological
$N$-body simulations with prescriptions to associate
galaxies with dark matter haloes turns out to be the most
efficient way to generate the required model galaxy samples.
Recently, White et al. (2011) presented clustering results
for scales in the range $\sim 0.5$-$20\,h^{-1}$ Mpc based on
$\sim$44,000 galaxies in the redshift range $0.4 < z < 0.7$
obtained during the first semester of BOSS operation. To
compare these observational results with theory, the authors
combined large, albeit low-resolution, $N$-body simulations
with the Halo Occupation Distribution (HOD)
model (e.g., Berlind & Weinberg 2002; Kravtsov et al. 2004;
Zentner et al. 2005; Skibba & Sheth 2009; Ross & Brunner
2009; Ross et al. 2010). Their results suggest that the majority
of BOSS galaxies are central systems living in haloes
with a mass of $\sim 10^{13}\,h^{-1}\,M_\odot$, while about 10% of them
are satellites typically residing in haloes $\sim 10$ times more
massive.



The HOD approach is the most often used framework
to make predictions for the large-scale distribution of galaxies.
Alternatively, HODs can also be measured in observations
(Zehavi et al. 2005; Abazajian et al. 2005; Brown et al.
2008; Zheng et al. 2009). The main component of classical
HOD models is the probability, $P(N|M)$, that a halo
of virial mass $M$ hosts $N$ galaxies with some specified properties.
In general, theoretical HODs require the fitting of a
function with several parameters (e.g., Kravtsov et al. 2004;
Zheng et al. 2005), which gives some freedom to match the
observed clustering of galaxies. These models also depend on
the theoretical approach adopted to predict the galaxy number
$N$ inside haloes of mass $M$. For example, Zheng et al.
(2005) used SPH simulations and semi-analytical models to
measure the number of galaxies as a function of hosting halo
mass, which is definitely a challenging theoretical exercise.
White et al. (2011) tuned five HOD free parameters to fit
the observed clustering of galaxies. In this case a random
fraction of dark matter particles is selected from the simulations
with a fraction following the optimized HOD. This
prescription gives the best match to observations, hence
producing good-quality mock catalogs. However, this is not
the best way of testing a cosmological model. Kravtsov et al.
(2004) used a different approach: they identify subhaloes
in high-resolution $N$-body simulations in order to associate
them with satellite galaxies. This is a more attractive path,
which can be further perfected by more accurate simulations
and more elaborate prescriptions for "galaxies" in dark
matter-only simulations (e.g., Trujillo-Gomez et al. 2011).

Halo Abundance Matching (HAM) has recently
emerged as an attractive alternative to HOD in order
to connect dark matter haloes with galaxies (Kravtsov et al.
2004; Tasitsiomi et al. 2004; Vale & Ostriker 2004;
Conroy et al. 2006; Guo et al. 2010; Wetzel & White
2010; Trujillo-Gomez et al. 2011).



Abundance-matching resolves the issue of connecting
observed galaxies to simulated dark matter haloes and subhaloes
by setting a one-to-one correspondence between the
red-band luminosity or stellar and dynamical masses: more
luminous galaxies are assigned to more massive (sub)haloes.
By construction, the method reproduces the observed
luminosity function (or stellar mass function). It also
reproduces the scale dependence of galaxy clustering over a
large range of epochs (Conroy et al. 2006; Guo et al. 2010).
When abundance matching is used for the observed stellar
mass function (Li & White 2009), it also gives a reasonable
fit to lensing results (Mandelbaum et al. 2006) and to the
relation between stellar and virial mass (Guo et al. 2010).
Guo et al. (2010) also attempted to reproduce the observed
relation between the stellar mass and the maximum circular
velocity with partial success, finding deviations both in
shape and amplitude between predictions and observations.
At circular velocities in the range 100-150 km s$^{-1}$ the
predicted circular velocity was $\sim 25\%$ lower than the
observed one. They proposed that this disagreement is
likely due to the fact that they did not include the effect
of baryons. Indeed, Trujillo-Gomez et al. (2011) show that
accounting for baryons drastically improves the situation.

Just as with HODs, there are different flavours
of HAMs. Generally, one does not expect a pure monotonic
relation between stellar and dynamical masses. There
should be some degree of stochasticity in this relation
due to deviations in the merger history, angular momentum,
and halo concentration. Even for haloes (or subhaloes)
with the same mass, these properties should be
different for different systems, which would lead to deviations
in stellar mass. Observational errors are also responsible
in part for the non-monotonic relation between
halo and stellar masses. Most modern HAM models
already implement prescriptions to account for the
stochasticity (Tasitsiomi et al. 2004; Behroozi et al. 2010;
Trujillo-Gomez et al. 2011; Leauthaud et al. 2011). The difference
between monotonic and stochastic models depends
on the magnitude of the scatter and on the stellar mass. The
typical value of the scatter in the $r$-band is expected to be
$\Delta M_r = 0.3$-$0.5$ mag (e.g., Trujillo-Gomez et al. 2011). For
Milky-Way-size galaxies the differences are practically
negligible (Behroozi et al. 2010), but they increase for very
massive galaxies such as those targeted with BOSS due to
the strong dependence of the bias on mass.

Almost two years after the start of the project, BOSS 
has obtained the spectra of about 487,000 galaxies and 
61,000 quasars. Using the SDSS-III Data Release 9 (DR9) 
BOSS data we present results on the two-dimensional, pro- 
jected and redshift-space correlation functions on scales from 
$\sim 500\,h^{-1}$ kpc to $\sim 90\,h^{-1}$ Mpc including fibre collision corrections.
In order to make predictions for the $\Lambda$CDM cosmological
model we use a large high-resolution $N$-body simulation
with a resolution high enough to resolve subhaloes,
which is very important for the HAM prescription. When 
connecting haloes with galaxies we use a stochastic HAM 
model. 

This paper is organized as follows. In Section 2 we
present the BOSS galaxy sample studied here, dubbed
"CMASS", and the measurements of the two-dimensional,
projected and redshift-space galaxy clustering in observations.
In Section 3 we present the details of the MultiDark
simulation, the halo catalogs and the HAM technique
adopted here. In Sections 4 and 5 we compare the clustering
measures with observations and study the occupation
distribution given by our halo catalog. We also compare
our halo occupation distribution with
that obtained by White et al. (2011) using an HOD model.
In Section 6 we study the scale-dependent bias of galaxy
clustering of the CMASS sample as inferred from our HAM
model both in real and Fourier space. Finally, in Section 7
we close the paper with the summary and conclusions.

In Appendix A we discuss several effects that can affect
the clustering power.

Figure 1. Sky area covered by the DR9 BOSS-CMASS sample
shown in Aitoff projection colour-coded by completeness (see
text). The upper and lower maps display the northern and southern
galactic caps respectively.



2 OBSERVATIONS 

2.1 The CMASS sample 

In this section we introduce the BOSS sample of massive 
galaxies analyzed in this work. The target galaxies are se- 
lected in such a way that the stellar mass of the systems 
is approximately constant over the entire redshift range 
of interest. As a consequence, the resulting galaxy sam- 
ple is usually dubbed 'constant mass' (CMASS) sample. 
These galaxies are characterized by high luminosities, which
translate into a rather low comoving space density of about
$3 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$. The sample can be obtained by applying
the following colour cuts to the observations (see e.g.
Eisenstein et al. 2011):

$$
\begin{aligned}
17.5 < i_{\rm cmod} &< 19.9, \\
r_{\rm mod} - i_{\rm mod} &< 2, \\
i_{\rm fiber2} &< 21.5, \\
d_\perp &> 0.55, \\
i_{\rm cmod} &< 19.86 + 1.6 \times (d_\perp - 0.8),
\end{aligned}
\qquad (1)
$$

where $d_\perp = (r_{\rm mod} - i_{\rm mod}) - (g_{\rm mod} - r_{\rm mod})/8$ and $i_{\rm fiber2}$ is
the $i$ magnitude measured with the 2" BOSS fiber within
the SDSS $ugriz$ photometric system (Fukugita et al. 1996).
The subscripts cmod and mod denote "cmodel" and "model"
magnitudes respectively. These cuts are chosen to pick out
massive red galaxies at $z > 0.4$. In particular, the condition
$d_\perp > 0.55$ selects systems with observed red $r - i$ colours,
whereas the conditions imposed on the $i$-band magnitude are
designed to identify an approximately complete galaxy sample
down to a limiting stellar mass. Most of these galaxies
($\sim 75\%$) show an early-type morphology with a characteristic
stellar mass of $M_* \sim 10^{11}\,h^{-1}\,M_\odot$ and an absolute
$r$-band magnitude of $M_r - 5\log h < -20.7$ (Masters et al.
2011).

Figure 2. The comoving number density of galaxies in the DR9
BOSS-CMASS sample both for the north and south subsamples in
the redshift range $0.4 < z < 0.7$. Dashed lines show the smoothed
distributions used to create the Poisson distribution of particles
when computing the correlation functions (see text).
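As an illustration of how the CMASS colour cuts act on a photometric catalogue, the following minimal sketch applies them to a single object. It assumes the standard CMASS definition of the composite colour, $d_\perp = (r - i) - (g - r)/8$, and a pivot value of 0.8 in the sliding magnitude cut; the function and argument names are hypothetical and do not correspond to any survey pipeline.

```python
def d_perp(g_mod, r_mod, i_mod):
    """Composite colour d_perp = (r - i) - (g - r)/8 from model magnitudes.

    The 1/8 coefficient is an assumption taken from the standard CMASS
    target selection.
    """
    return (r_mod - i_mod) - (g_mod - r_mod) / 8.0


def is_cmass(g_mod, r_mod, i_mod, i_cmod, i_fiber2):
    """Return True if an object passes all five CMASS-like colour cuts."""
    dp = d_perp(g_mod, r_mod, i_mod)
    return (
        17.5 < i_cmod < 19.9                     # bright and faint limits
        and (r_mod - i_mod) < 2.0                # reject extreme colours
        and i_fiber2 < 21.5                      # enough flux in the fibre
        and dp > 0.55                            # red r - i colour
        and i_cmod < 19.86 + 1.6 * (dp - 0.8)    # sliding magnitude cut
    )
```

For example, an object with $(g, r, i)_{\rm mod} = (21.5, 20.0, 19.0)$, $i_{\rm cmod} = 19.0$ and $i_{\rm fiber2} = 20.5$ has $d_\perp = 0.8125$ and passes the selection, whereas a bluer object with $r - i = 0.3$ is rejected by the $d_\perp$ cut.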

Schlafly et al. (2010) and Schlafly & Finkbeiner (2011)
found systematic offsets between the colours of SDSS objects
in the southern and northern Galactic hemispheres which
might reflect a combination of percent-level calibration errors
in the SDSS photometry and errors in the corrections for
Galactic extinction. The Schlafly & Finkbeiner (2011) results
suggest a systematic offset in the value of $d_\perp$ of 0.0064
between the north and south. As the CMASS selection criteria
depend on $d_\perp$, this offset leads, in principle, to a difference
in the galaxy samples selected for spectroscopic observations
in the two hemispheres. Ross et al. (2011) found
a 2% difference in the number density of CMASS targets
between the northern and southern hemispheres, which reduces
to 0.3% when this offset is applied to the galaxies
in the south before applying the CMASS selection criteria.
However, et al. (2012) found no appreciable north/south
colour offsets in their sample. In this work we do not apply
a colour offset to the selection of CMASS galaxies in
the south. Although we present results obtained from the
combined (north+south) CMASS sample, we also analyse
the data from the northern and southern hemispheres separately
in order to avoid potential systematics that could be
associated with the use of slightly different selection criteria.

For a number of reasons it is not possible to obtain
reliable redshifts for all the galaxies satisfying the CMASS
selection criteria (see Section 2.2). We estimate the completeness
$c = n_z/n_t$, where $n_t$ is the number of galaxy targets
and $n_z$ the number of these with reliable redshift estimates
(weighted as described in Section 2.2), for each sector
of the survey mask, that is, the areas of the sky covered
by a unique set of spectroscopic tiles (Blanton et al. 2001;
Tegmark et al. 2004), which we characterize using the Mangle
software (Hamilton & Tegmark 2004; Swanson et al.
2008). The average completeness of the combined CMASS
sample is 98.2%. We trim the final area of our sample to
all sectors with completeness $c \geq 0.75$, producing our final
sample of 282,068 galaxies, of which 219,773 and 62,295 are
located in the northern and southern galactic caps respectively.
Fig. 1 shows an Aitoff projection of the resulting survey
mask in the northern (upper panel) and southern (lower
panel) regions, with effective areas $\Omega_{\rm eff} = \sum_i c_i \Omega_i$ of
2502 deg$^2$ and 688 deg$^2$ respectively, where
the sum extends over all sectors contained in the mask and
$\Omega_i$ corresponds to their solid angles. The redshift distribution
of the CMASS sample can be seen in Fig. 2 both for the northern and southern
subsamples. The dashed lines show the smoothed distributions
used to create the random samples of points for our
clustering analysis (see Section 2.2). As shown in this figure
the galaxy number density peaks at $z \sim 0.52$, having a
value of $n_g \approx 3.6 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$ and a mean redshift of
$z = 0.55$.
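The completeness trimming and the effective-area sum described above can be sketched as follows. This is only a schematic illustration: the tuple layout of `sectors` is a hypothetical convention, and the count of good redshifts is taken as already weighted.

```python
def effective_area(sectors, c_min=0.75):
    """Effective area: sum of c_i * Omega_i over sectors with c_i >= c_min.

    sectors: iterable of (n_targets, n_good_redshifts, solid_angle_deg2).
    n_good_redshifts is assumed to be already weighted, if weights are used.
    """
    area = 0.0
    for n_t, n_z, omega in sectors:
        if n_t == 0:
            continue                 # empty sector: completeness undefined
        c = n_z / n_t                # sector completeness c = n_z / n_t
        if c >= c_min:               # trim low-completeness sectors
            area += c * omega
    return area
```

With three toy sectors of completeness 0.98, 0.60 and 1.0 and solid angles of 2.0, 1.0 and 0.5 deg$^2$, the second sector is discarded and the effective area is $0.98 \times 2.0 + 1.0 \times 0.5 = 2.46$ deg$^2$.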



2.2 Clustering measures 

We characterize the clustering of the CMASS galaxy sample
by means of two-point statistics in configuration space. We
measure the angle-averaged redshift-space correlation function
$\xi(s)$ and the full two-dimensional $\xi(\sigma, \pi)$, where $\sigma$ and
$\pi$ are the components in the direction perpendicular and
parallel to the line of sight of the total separation vector $s$.
These measurements are affected by redshift-space distortions.
In order to obtain a clustering measure that is less
sensitive to these effects we also compute the projected
correlation function (Davis & Peebles 1983)

$$ \Xi(\sigma) = 2 \int_0^{\infty} \xi(\sigma, \pi)\,{\rm d}\pi. \qquad (2) $$

In practice, we sum all pairs with $\pi_{\rm max} < 100\,h^{-1}$ Mpc.
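The truncated line-of-sight integration can be written as a discrete sum over the $\pi$ bins of a measured $\xi(\sigma, \pi)$. The sketch below assumes a hypothetical binning in which `xi_row` holds the values of $\xi$ at fixed $\sigma$ and `pi_edges` the corresponding bin edges in $h^{-1}$ Mpc; it is not the code used for the measurements.

```python
def projected_xi(xi_row, pi_edges, pi_max=100.0):
    """Approximate Xi(sigma) = 2 * sum_j xi(sigma, pi_j) * dpi_j.

    xi_row   : values of xi(sigma, pi) in each pi bin at fixed sigma
    pi_edges : bin edges along the line of sight (len(xi_row) + 1 values)
    pi_max   : bins extending beyond this separation are ignored
    """
    total = 0.0
    for xi, lo, hi in zip(xi_row, pi_edges[:-1], pi_edges[1:]):
        if hi <= pi_max:
            total += xi * (hi - lo)
    return 2.0 * total
```

The factor of two accounts for pairs with negative line-of-sight separation, assuming $\xi(\sigma, \pi)$ is symmetric in $\pi$.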

We compute the full correlation functions $\xi(\sigma, \pi)$ using
the Landy & Szalay (1993) estimator

$$ \xi(\sigma, \pi) = \frac{DD - 2DR + RR}{RR}, \qquad (3) $$

where $DD$, $DR$, and $RR$ are the suitably normalized numbers
of data-data, data-random, and random-random pair
counts in each bin of $(\sigma, \pi)$. In order to measure these quantities
without introducing systematic effects, a few important
corrections must be taken into account. Here we give a
brief description of the main issues that should be considered,
while a more detailed discussion will be presented in
Ross et al. (2012).
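For a single separation bin, the Landy & Szalay estimator can be evaluated as in the sketch below. It assumes unweighted raw pair counts, normalized by the total number of possible pairs of each type; in an analysis that uses galaxy weights the counts and normalizations would be replaced by sums of weights. The function name and signature are illustrative only.

```python
def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy-Szalay estimate xi = (DD - 2DR + RR) / RR for one bin.

    dd, dr, rr : raw (unweighted) pair counts in the bin
    n_data     : number of data points
    n_rand     : number of random points
    """
    dd_norm = dd / (n_data * (n_data - 1) / 2.0)   # data-data pairs
    dr_norm = dr / (n_data * n_rand)               # data-random pairs
    rr_norm = rr / (n_rand * (n_rand - 1) / 2.0)   # random-random pairs
    return (dd_norm - 2.0 * dr_norm + rr_norm) / rr_norm
```

If the data are unclustered, the three normalized counts coincide and the estimator returns zero; doubling the normalized data-data counts relative to the randoms gives $\xi = 1$.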

As described in the previous section, the spectroscopic 
CMASS sample is constructed from a target list drawn from 
the SDSS photometric observations. Even though the overall 
completeness of the CMASS sample is high, it is not pos- 
sible to obtain reliable redshifts for all galaxies satisfying 
the selection criteria specified in Section 2.1. Which galaxies
are observed spectroscopically is determined by an adaptive
tiling algorithm, based on that of Blanton et al. (2003),
which attempts to maximize the number of measured spectra
over the survey area. As a result of this algorithm, not
all galaxies satisfying the CMASS criteria are selected as 
targets for spectroscopy. Even when a fibre is assigned to a 
galaxy and a spectrum is observed, it might not be possible 
to obtain a reliable estimation of the redshift of the object, 
leading to what is called a redshift failure. These tend to oc- 
cur for fibres located near the edges of the observed plates. 
This implies that it is not possible to simply consider these 
redshift failures as an extra component affecting the overall 
completeness of the sector since their probability is not uniform
across the field. In order to correct for this effect we
define a set of weights, $w_{\rm rf}$, whose default value is one for
all galaxies in the sample. For every galaxy with a redshift
failure, we increase by one the value of $w_{\rm rf}$ of the nearest
galaxy with a good redshift measurement. The application
of these weights effectively corrects for the non-uniformity 
effects produced by redshift failures. 
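The nearest-neighbour up-weighting for redshift failures can be sketched as follows. This is a brute-force illustration under simplifying assumptions: positions are given as (RA, Dec) pairs in degrees, the nearest neighbour is defined by angular separation on the whole sky, and no restriction to the same plate or sector is applied. All names are hypothetical.

```python
import math


def angular_sep(ra1, dec1, ra2, dec2):
    """Angular separation in radians (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return 2.0 * math.asin(math.sqrt(min(1.0, a)))


def redshift_failure_weights(good, failed):
    """Start from unit weights and add one to the nearest good-redshift
    galaxy for every redshift failure.

    good, failed : lists of (ra_deg, dec_deg) tuples
    """
    weights = [1.0] * len(good)
    for ra_f, dec_f in failed:
        nearest = min(range(len(good)),
                      key=lambda i: angular_sep(ra_f, dec_f, *good[i]))
        weights[nearest] += 1.0
    return weights
```

By construction the sum of the weights equals the total number of galaxies with and without a reliable redshift, so the up-weighting preserves the target number counts.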

The main cause for the loss of objects is, however, fibre
collisions (Zehavi et al. 2002; Masjedi et al. 2006). The
BOSS spectrographs are fed by optical fibres plugged on
plates, which must be separated by at least 62" (in the
concordance cosmology this corresponds to a distance of
$\sim 0.27\,h^{-1}$ Mpc at $z \sim 0.5$). It is then impossible, in any
given observation, to obtain spectra of all galaxies with 
neighbours closer than this angular distance. The problem 
is alleviated in regions covered by multiple exposures, but it 
is in general not possible to observe all objects in crowded 
regions. 
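The conversion of the 62" fibre collision angle into a transverse distance can be checked with a short numerical integration. The sketch below assumes a flat $\Lambda$CDM model with $\Omega_m = 0.27$ and uses a simple trapezoidal rule; the function names are illustrative. In this sketch the quoted value of $\sim 0.27\,h^{-1}$ Mpc is recovered when the angular-diameter (proper) distance is used, while the comoving transverse separation at the same redshift is roughly $0.4\,h^{-1}$ Mpc.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, omega_m=0.27, n_steps=1000):
    """Line-of-sight comoving distance in h^-1 Mpc for flat LCDM,
    obtained with a trapezoidal integration of dz / E(z)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)

    dz = z / n_steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for k in range(1, n_steps):
        integral += inv_e(k * dz)
    return (C_KM_S / 100.0) * integral * dz  # Hubble distance c/H0 = 2997.9 h^-1 Mpc


def transverse_scale(theta_arcsec, z, comoving=False):
    """Transverse distance (h^-1 Mpc) subtended by an angle at redshift z."""
    theta = math.radians(theta_arcsec / 3600.0)
    d_c = comoving_distance(z)
    distance = d_c if comoving else d_c / (1.0 + z)  # angular-diameter distance
    return distance * theta
```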

In this work we correct for the impact of fibre collisions
on our clustering measurements by applying the correction
presented in Guo et al. (2011). Using this method
the total galaxy sample $D$ is divided into two subsamples,
dubbed $D_1$ and $D_2$. These are constructed following the
targeting algorithm of the catalogue in a way that guarantees
that group $D_1$ is not affected by fibre collisions, while $D_2$
contains all collided galaxies. Any clustering measurement
of the combined sample can be obtained as a combination
of the contributions from these two groups. Based on tests
on mock galaxy catalogues, Guo et al. (2011) showed that
the application of this method can accurately recover the
projected and redshift-space correlation functions on scales
both below and above the fibre collision scale, providing a
substantial improvement over the commonly used nearest
neighbour and angular correction methods.

We constructed random catalogues for subsamples $D_1$
and $D_2$ for the northern and southern hemispheres with 40
times more objects than the real data following their respective
angular completenesses. The redshifts of these random
points were generated in order to follow the distributions of
the real samples, which were obtained by a smoothing spline
interpolation of the observed redshift distributions.

Figure 3. Projected correlation function times the projected distance
for the DR9 BOSS-CMASS galaxy sample in the redshift
range $0.43 < z < 0.7$. The blue and red shaded areas correspond
to the north and south subsamples and give an estimate
of their standard deviation. The dot-dashed lines display their
mean value. The result of combining both subsamples is shown
as filled circles. Standard deviations for the projected correlations
of all samples are estimated using an ensemble of 600 mock catalogs
(see Section 2.2). For comparison the projected correlation
inferred from the first semester of the BOSS-CMASS data is also
shown (open circles; White et al. 2011).

With the increasing size of current galaxy surveys, and
the corresponding improvement in the statistical uncertainties,
the contribution of systematic errors to the total error
budget of any clustering statistic becomes increasingly important.
Due to its large volume and high number density,
BOSS is perhaps one of the best examples of this. Ross et al.
(2012) present a detailed analysis of the systematic effects
that could potentially affect any clustering measurement
based on the CMASS sample and show that, besides redshift
failures and fibre collisions, other important systematics
must be considered in order to obtain accurate clustering
measurements. The main result from this analysis is that
these systematics can be corrected for by applying a set of
weights, $w_{\rm sys}$, which depend on both the galaxy properties
and their positions in the sky. We consider these weights in
all our clustering measurements.

Finally, we also include a set of weights to reduce the
variance of the estimator that are given by

$$ w = \left[1 + n(z)\,J_w\right]^{-1}, \qquad (4) $$

where $n(z)$ is the mean galaxy density at redshift $z$ and $J_w$
is a free parameter. Hamilton (1993) showed that setting
$J_w = 4\pi J_3(s)$, where $J_3(s) = \int_0^s \xi(s')\,s'^2\,{\rm d}s'$, minimizes the
variance on the measured correlation function for the given
scale $s$. Here we follow the standard practice and use a scale-independent
value of $J_w = 2 \times 10^4$.
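As a quick numerical illustration of these minimum-variance weights, the sketch below evaluates $w = 1/[1 + n(z) J_w]$ for a given number density. It assumes that $n(z)$ is expressed in $h^3\,{\rm Mpc}^{-3}$ and $J_w$ in the inverse units, so that their product is dimensionless; the function name is hypothetical.

```python
def variance_weight(n_z, j_w=2.0e4):
    """Minimum-variance weight w = 1 / (1 + n(z) * J_w).

    n_z : mean galaxy number density at the galaxy redshift
          (assumed to be in h^3 Mpc^-3)
    j_w : scale-independent J_w (assumed to be in h^-3 Mpc^3)
    """
    return 1.0 / (1.0 + n_z * j_w)
```

At the peak density of the sample, $n \approx 3.6 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$, the product $n J_w$ is about 7.2 and the weight is close to 0.12, whereas in the sparse tails of the redshift distribution the weight approaches unity.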

Fig. 3 shows the projected correlation functions $\Xi(\sigma)$
times the projected distance of the north, south and combined
CMASS samples. The combined sample gives a similar
outcome to that of the north as a result of the higher
statistics in the latter. For comparison the projected correlation
inferred from a CMASS sample corresponding to
the first semester of the BOSS observations is also shown
(open circles; White et al. 2011). Besides the increase in the
sample size and the volume probed, there are differences at
small and large scales which are probably due to the different
corrections for fibre collisions and the use of the weights
to correct for the systematics affecting the galaxy density
field.

Although the projected correlation functions of the 
northern and southern subsamples agree within their respec- 
tive uncertainties, they show some intriguing differences. At 
scales in the range $\sim 20$-$50\,h^{-1}$ Mpc the amplitude of $\Xi(\sigma)$
in the south is higher than that of the north. Similarly, the
measurements of $\xi(s)$ show the same behaviour. However,
in this case, the agreement of the mean values is somewhat 
closer (see section 

To estimate covariance matrices for these clustering
measures, we use a set of 600 mock catalogs designed to
follow the same geometry and redshift distribution of the
CMASS sample while mimicking their clustering properties
at large scales (Manera et al. 2012). These mocks are
inspired by the PTHalos method of Scoccimarro & Sheth
(2002), although there are some important differences. The
resulting covariances are compatible with the results of $N$-body
simulations. For a detailed description of these
mocks and their comparison with $N$-body results see
Manera et al. (2012).$^{1}$



$^{1}$ Mocks will be available at http://www.marcmanera.net/mocks/



3 CLUSTERING IN THE $\Lambda$CDM MODEL

3.1 The MultiDark simulation

The MultiDark run (MDR1) is an $N$-body cosmological
simulation of the $\Lambda$CDM model that was done using
the Adaptive-Refinement-Tree (ART) code (Kravtsov et al.
1997; Gottlöber & Klypin 2008). The simulation has
$2048^3 \approx 8.6 \times 10^9$ dark matter particles in a box of
$1\,h^{-1}$ Gpc on a side. The mass of the dark matter particle is
$8.72 \times 10^9\,h^{-1}\,M_\odot$. The cosmological parameters adopted in
the simulation are consistent with the latest WMAP7 results
(Jarosik et al. 2011) and with other cosmological probes (see
Table 1 of Klypin et al. 2011). Hence, we adopt a matter
density parameter $\Omega_m = 0.27$ and a dimensionless Hubble
parameter $h = 0.7$. Initial conditions were set at the redshift
$z_{\rm init} = 65$ using a power spectrum characterized by a
scalar spectral index $n_s = 0.95$ and normalized to $\sigma_8 = 0.82$
in the same way as done for the Bolshoi simulation (see
Klypin et al. 2011, for a detailed description of this simulation).
The ART code is designed in such a way that the
physical resolution is nearly preserved over time with a value
of $\sim 7\,h^{-1}$ kpc for the redshift range between $z = 0$-$8$. For
further details on the ART code and MultiDark simulation
see Prada et al. (2011) and references therein.

3.1.1 Halo finding

Dark matter haloes are identified in the simulation with a
parallel version of the Bound-Density-Maxima (BDM) algorithm
(Klypin & Holtzman 1997; Riebe et al. 2011). The
BDM is a Spherical Overdensity (SO) code. It finds all density
maxima in the distribution of particles using a top-hat
filter with 20 particles. For each maximum the code estimates
the radius within which the overdensity has a specified
value. Among all overlapping density maxima the code
finds the one having the deepest gravitational potential. The
position of this maximum is the centre of a "distinct" halo,
which is a halo whose centre is not inside the virial radius
of a bigger one. Distinct haloes are also tracers of central
galaxies. Self-bound haloes with more than 20 particles lying
inside the virial radius of a distinct halo are classified as subhaloes.
Subhalo identification is more subtle since it requires
the removal of unbound particles and identification of fake
satellites. See Riebe et al. (2011) for a more detailed description
of the algorithm. The BDM halo finder was extensively
tested and compared with other halo finders (Knebe et al.
2011; Behroozi et al. 2011). In Appendix A we show a comparison
between the real-space correlation function for halo
catalogs selected both with BDM and RockStar halo finders
(see Fig. A1). The BDM halo catalogs for the MDR1 simulation
are publicly available at the MultiDark Database:
http://www.multidark.org

The size of a distinct halo can be defined by means of
the spherical radius within which the average density is $\Delta$
times higher than the critical density of the Universe, $\rho_{\rm cr}(z)$.
As a consequence, the corresponding enclosed mass is given
by

$$ M(<r) = \frac{4\pi}{3}\,\Delta\,\rho_{\rm cr}(z)\,r^3. \qquad (5) $$

We use a threshold overdensity of $\Delta = 200$ that corresponds
to values for halo mass and radius of $M_{200}$ and $R_{200}$. BDM
catalogs also provide virial masses and radii ($M_{\rm vir}$ and
$R_{\rm vir}$) defined using the standard overdensity $360\,\rho_{\rm back}(z)$
(background mean density).
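The spherical-overdensity relation between halo mass and radius can be illustrated with the short sketch below. It assumes a flat $\Lambda$CDM background with $\Omega_m = 0.27$, a present-day critical density of $2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$, masses in $h^{-1}\,M_\odot$ and physical radii in $h^{-1}$ Mpc; the function names are hypothetical and unrelated to the BDM code.

```python
import math

RHO_CRIT0 = 2.775e11  # critical density today in h^2 Msun Mpc^-3


def rho_crit(z, omega_m=0.27):
    """Critical density at redshift z for a flat LCDM model."""
    return RHO_CRIT0 * (omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def so_mass(radius, z, delta=200.0):
    """Mass enclosed by a sphere whose mean density is delta * rho_crit(z)."""
    return 4.0 * math.pi / 3.0 * delta * rho_crit(z) * radius ** 3


def so_radius(mass, z, delta=200.0):
    """Radius of the sphere enclosing `mass` at mean density delta * rho_crit(z)."""
    return (3.0 * mass / (4.0 * math.pi * delta * rho_crit(z))) ** (1.0 / 3.0)
```

Under these assumptions a halo with $M_{200} = 10^{13}\,h^{-1}\,M_\odot$ at $z = 0.53$ has a physical radius $R_{200}$ of roughly $0.3\,h^{-1}$ Mpc.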

One of the most important characteristics of a (sub)halo
is its maximum circular velocity:

$$ V_{\rm max} = \max\left[\sqrt{\frac{G M(<r)}{r}}\,\right]. \qquad (6) $$

There are several advantages of using $V_{\rm max}$ to characterize
a halo as opposed to the "virial mass". First, $V_{\rm max}$ does
not have the ambiguity related with the definition of mass.
Virial mass and radius vary depending on the overdensity
threshold used. For the often-employed overdensity 200 and
"virial" overdensity thresholds, the differences in definitions
result in changes in the halo radius from one definition to
another and, thus, in concentration, by a factor of 1.2-1.3,
with the exact value dependent on the halo concentration.
Second and more important, the maximum circular velocity
$V_{\rm max}$ is a better quantity to characterize haloes when we
relate them to the galaxies inside these haloes. For galaxy-size
haloes the maximum circular velocity is defined at a
radius of $\sim 40$ kpc, i.e., closer to the sizes of luminous parts
of galaxies than the much larger virial radius, which for the
Milky-Way halo is $\sim 250$ kpc (Klypin et al. 2002).
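For a halo sampled by equal-mass particles, the maximum circular velocity can be estimated directly from the sorted particle radii, as in the following sketch. It is a simplified illustration, not the BDM implementation: no unbinding or profile smoothing is performed, radii are taken as physical distances in Mpc and masses in $M_\odot$ (the factors of $h$ cancel if both are given in $h^{-1}$ units).

```python
import math

G = 4.30091e-9  # gravitational constant in Mpc (km/s)^2 / Msun


def v_max(radii, particle_mass):
    """Maximum of sqrt(G M(<r) / r) over the radii of equal-mass particles.

    radii         : distances of the particles from the halo centre (Mpc)
    particle_mass : mass of each particle (Msun)
    Returns the maximum circular velocity in km/s.
    """
    best = 0.0
    for n_enclosed, r in enumerate(sorted(radii), start=1):
        if r <= 0.0:
            continue  # skip a particle sitting exactly at the centre
        v_circ = math.sqrt(G * particle_mass * n_enclosed / r)
        best = max(best, v_circ)
    return best
```

In practice the innermost few particles give a noisy estimate of $M(<r)/r$, so a realistic implementation would typically start the search beyond some minimum number of enclosed particles.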
Figure 4. Bottom panel: The cumulative number density of distinct
haloes (dashed line) and subhaloes (dotted line) in the MultiDark
simulation at $z = 0.53$ as a function of maximum circular
velocity. The cumulative number for all haloes is also shown as a
solid line. Top panel: The cumulative subhalo fraction as a function
of halo maximum circular velocity. As a reference we indicate
in both panels the mean number density of the BOSS-CMASS
galaxy sample and as vertical lines the corresponding maximum
circular velocity threshold ($V_{\rm cut}$) used in the HAM procedure.

3.2 Bridging the gap between galaxies and haloes 

Once we have the maximum circular velocities for distinct
haloes and subhaloes the implementation of the HAM prescription
is simple. We start with a monotonic assignment.
We count all haloes and subhaloes which have maximum circular
velocity $V_{\rm max}$ larger than $V_{\rm cut}$, and gradually decrease
the value of $V_{\rm cut}$ until the number density of (sub)haloes is
equal to that of BOSS galaxies at $z \approx 0.5$.
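The monotonic assignment can be sketched in a few lines. Instead of lowering $V_{\rm cut}$ step by step, the illustration below sorts the (sub)haloes by $V_{\rm max}$ and keeps the top-ranked objects until the target number density is reached, which yields the same selection. The function name and the flat-list catalogue layout are assumptions made for this example.

```python
def monotonic_ham(vmax_values, box_size, n_gal):
    """Select the (sub)haloes with the largest V_max until their number
    density matches the target galaxy density.

    vmax_values : list of V_max for all haloes and subhaloes (km/s)
    box_size    : side of the simulation box (h^-1 Mpc)
    n_gal       : target galaxy number density (h^3 Mpc^-3)
    Returns (v_cut, indices of the selected objects in rank order).
    """
    n_target = int(round(n_gal * box_size ** 3))
    order = sorted(range(len(vmax_values)),
                   key=lambda i: vmax_values[i], reverse=True)
    selected = order[:n_target]
    # V_cut is the V_max of the last object that enters the sample
    v_cut = vmax_values[selected[-1]] if selected else float("inf")
    return v_cut, selected
```

For a toy catalogue with $V_{\rm max}$ = 500, 200, 420, 380, 610 and 150 km s$^{-1}$ in a box of $10\,h^{-1}$ Mpc and a target density of $3 \times 10^{-3}\,h^3\,{\rm Mpc}^{-3}$, three objects are selected and the resulting threshold is $V_{\rm cut} = 420$ km s$^{-1}$.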

The bottom panel of Fig. 4 shows the number density
of (sub)haloes in the MultiDark simulation at $z = 0.53$.
A number density close to that of the BOSS-CMASS sample
corresponds to haloes and subhaloes with a maximum
circular velocity above 362 km s$^{-1}$, which is larger than
the completeness limit of the MultiDark simulation, i.e.,
$\sim 180$ km s$^{-1}$. This means that haloes and subhaloes hosting
BOSS-CMASS galaxies are well resolved. The top panel of
Fig. 4 shows the cumulative subhalo fraction as a function of
maximum circular velocity. For values of $V_{\rm max} > 350$ km s$^{-1}$
the subhalo fractions are typically less than 10%. We will return
to this point again in Section 5.

3.2.1 Halo stochasticity 

There are a number of arguments why there should be some
degree of stochasticity in the stellar mass - circular velocity
relation (e.g., Tasitsiomi et al. 2004; Behroozi et al. 2010;
Trujillo-Gomez et al. 2011). In our case the stochasticity
means that some haloes above the velocity cut host galaxies
with stellar masses smaller than the corresponding stellar
mass cut of the BOSS sample and should not be included
in the sample. Simultaneously, some smaller haloes may
host galaxies with a larger stellar mass, and should be considered.
Because the number density of galaxies is fixed by
observations, the numbers of included and excluded haloes
must be equal. Following Trujillo-Gomez et al. (2011) we
implement this process using a Gaussian spread with an offset.
If $V_{\rm cut}$ is the velocity cut in the monotonic assignment,
then a (sub)halo is taken if its maximum circular velocity
$V_{\rm max}$ satisfies the condition

$$V_{\rm max}\,[1 + \mathcal{N}(0, \sigma)] - \Delta V > V_{\rm cut}, \qquad (7)$$

where $\mathcal{N}(0, \sigma)$ is a Gaussian random number with mean
zero and rms $\sigma$. The offset $\Delta V$ is needed to compensate
the larger influx of smaller haloes. We use $\sigma = 0.2$ and
$\Delta V = 18$ km s$^{-1}$, which is similar to the values adopted
by Trujillo-Gomez et al. (2011). Note that the offset $\Delta V$
and the spread $\sigma$ are not free parameters. The offset is just
a normalization. The value of $\sigma$ is defined by the spread
of the observational baryonic Tully-Fisher relation (or its
equivalent for early-type galaxies), which has uncertainties
(e.g., Trujillo-Gomez et al. 2011). The stochastic assignment
has a very small effect on clustering for scales larger than
0.5 $h^{-1}$ Mpc, decreasing the correlation functions by no more
than ~ 8%.
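
As an illustration of the stochastic selection, the following Python sketch applies the condition $V_{\rm max}[1 + \mathcal{N}(0, \sigma)] - \Delta V > V_{\rm cut}$ to an array of velocities, with the quoted values $\sigma = 0.2$ and $\Delta V = 18$ km s$^{-1}$ as defaults. The function name, the argument layout and the optional random generator are assumptions made for the example only.

```python
import numpy as np

def select_stochastic(vmax, vcut, sigma=0.2, delta_v=18.0, rng=None):
    """Boolean mask of (sub)haloes kept by the stochastic assignment.

    vmax    : maximum circular velocities [km/s]
    vcut    : velocity cut of the monotonic assignment [km/s]
    sigma   : rms of the fractional Gaussian scatter
    delta_v : velocity offset [km/s]
    rng     : optional numpy Generator, for reproducible draws
    """
    rng = np.random.default_rng() if rng is None else rng
    vmax = np.asarray(vmax, dtype=float)
    # One independent Gaussian draw N(0, sigma) per object.
    scatter = rng.normal(loc=0.0, scale=sigma, size=vmax.shape)
    return vmax * (1.0 + scatter) - delta_v > vcut
```

In practice one would check that the number of selected objects divided by the box volume still matches the observed number density, since the offset acts only as a normalization.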



3.2.2 Subhalo tidal stripping 

In order to apply our HAM technique we use the maximum 
circular velocity at z = as a proxy, which is a quan- 
tity that can be easily measured for haloes in our simu- 
lation. However, it is generally accepted that, for subhaloes, 
a better characteristic would be the peak value of the maximum
circular velocity, $V_{\rm peak}$, during subhalo evolution (e.g.,
Conroy et al. 2006; Trujillo-Gomez et al. 2011). The latter
is motivated by the tidal stripping effect: once a halo falls 
into the potential well of a larger one some of its material 
can be stripped away, thus lowering the value of Vmax. Since 
in real galaxies stars occupy the inner regions of subhaloes, 
where tidal forces are much weaker, their circular velocities 
should be, in general, less influenced by this effect. 

We expect the tidal stripping for BOSS-CMASS
satellites, though present, not to be dominant, thus allowing
us to use $V_{\rm max}$ for subhaloes as a reliable proxy for the
HAM technique. In this case, satellites with masses of 
10^"^ Mq are typically located at large distances from 
their central hosts, which can reach even larger mass values 
of ~ 10"-10^^ h-^ Mq (see Section 

To estimate the magnitude of the potential stripping 
effect in these systems we run a series of simple simulations. 
Using a direct-summation N-body code, we study the idealized
case of a satellite orbiting its central host, where the
latter is modeled as a static Navarro-Frenk-White (NFW)
potential (Navarro et al. 1996). Initially, the satellite was
set as a distribution of particles following an equilibrium 
NFW distribution with isotropic velocities. The mass per 
particle and force softening were set to 8 x 10^ Mq 
and 0.1 $h^{-1}$ kpc, respectively. Particle mass decreases with
decreasing distance to the central halo as a way to achieve 
a better mass resolution in the central regions. In order to 
Figure 5. Contours of the two-dimensional correlation function
$\xi(\sigma, \pi)$ estimated from the DR9 BOSS-CMASS north galaxy sample
(dashed contours) at $0.4 < z < 0.7$ and for our MultiDark halo
catalog constructed using the HAM technique at $z = 0.53$ (solid
contours).



check for equilibrium stability and numerical effects we did 
a test run for an isolated satellite, i.e. without considering 
an external tidal field, finding that its maximum circular ve- 
locity was well preserved during the entire evolution of the 
system, which was set to 5 Gyr. 

We study two different cases for a satellite of mass 
Afsat = lO'^^ hr^ alternatively assuming either A/host ~ 
10" hr^ M0 or Mhost = lO^'' hT^ for the mass of the 
central host. Halo concentrations were selected to follow the
results of Prada et al. (2011); thus they were taken to be
$c_{\rm vir} = 8.2, 6.9, 5.8$, in order of increasing halo mass. Stripping
depends strongly on the distance to the centre. For
instance, for a central system with mass 
Mq, we find that a satellite with a pericentre (apocentre) 
of 100 (500) kpc loses around half of its maximum circular
velocity in 5 Gyr. However, this is not typical of satellites
in large galaxy clusters. We find that, for both host
halo masses, tidal stripping is much less efficient for
satellites falling from the virial radius with
apocentre-to-pericentre ratios of ~ 4:1 - 3:1, changing their
maximum circular velocity by only 15-20% after 5 Gyr. The effect is
much smaller after the first ~ 2 Gyr of evolution, producing
a variation of less than 5%.

Since, in this work, the minimum studied physical scale 
is > 0.5 Mpc, it is expected that most of the BOSS-CMASS 
satellites have spent most of their time at larger distances 
from their central haloes, where the impact of tidal forces is 
small. Thus, considering the relatively small change of the 
subhalo maximum circular velocities due to tidal stripping, 
we use $V_{\rm max}$ as a proxy for our HAM instead of $V_{\rm peak}$. Interestingly,
Watson et al. (2012), based on a subhalo evolution
model applied to clustering measurements in the SDSS, sug- 
gest that tidal stripping of stars in luminous galaxies is much 
less efficient than in less luminous systems, which provides 
additional support to our choice. 



3.3 Modeling BOSS-CMASS clustering 

We use the MultiDark BDM catalogs constructed for the
overdensity $360\rho_{\rm back}(z)$ to facilitate the comparison with
the HOD modeling presented in White et al. (2011). However,
as stated before, our results do not depend on halo
mass definition since halo matching is done using the maximum
circular velocity $V_{\rm max}$ of either distinct haloes or subhaloes.
We use redshift $z = 0.53$, which is close to the peak
value of the BOSS-CMASS sample (see Fig. O. 

To model the effect of galaxy peculiar velocities in the
redshift measurements, we transform the coordinates of our
(sub)haloes to redshift-space using $\mathbf{s} = \mathbf{x} + \mathbf{v}\cdot\hat{\mathbf{r}}/(aH)$,
where $\mathbf{x}$ and $\mathbf{v}$ are their position and peculiar velocity vectors
respectively, $a$ is the scale factor and $H$ is the Hubble
constant. We compute the two-dimensional correlation function
$\xi(\sigma, \pi)$ of our catalog by counting the number of "galaxy"
tracers in bins parallel ($\pi$) and perpendicular ($\sigma$) to the line-of-sight.
When estimating the projected correlation function,
we count all pairs along the parallel direction out to
$\pi_{\rm max} \approx 100\,h^{-1}$ Mpc.
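
The redshift-space mapping can be sketched in Python as follows. The sketch assumes that the displacement is applied along the radial line of sight from a single observer, that positions are in $h^{-1}$ Mpc, velocities in km s$^{-1}$, and that $H$ is supplied in km s$^{-1}$ per $h^{-1}$ Mpc; the function name, the observer argument and these unit conventions are assumptions of the example, not specifications from the text.

```python
import numpy as np

def to_redshift_space(pos, vel, scale_factor, hubble, observer=(0.0, 0.0, 0.0)):
    """Shift positions along the line of sight by the peculiar velocity.

    pos, vel     : arrays of shape (N, 3), positions and peculiar velocities
    scale_factor : a = 1 / (1 + z)
    hubble       : H at the redshift of the catalogue
    observer     : observer position; must not coincide with any object
    """
    pos = np.asarray(pos, dtype=float)
    vel = np.asarray(vel, dtype=float)
    rel = pos - np.asarray(observer, dtype=float)
    # Unit vectors pointing from the observer to each object.
    los = rel / np.linalg.norm(rel, axis=1, keepdims=True)
    # Line-of-sight velocity component of each object.
    v_los = np.sum(vel * los, axis=1, keepdims=True)
    return pos + v_los * los / (scale_factor * hubble)
```

Only the velocity component along the line of sight moves an object; the transverse components leave its position unchanged.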

To estimate the cosmic variance we use a set of simulations
from the Large Suite of Dark Matter Simulations (LasDamas;
see http://lss.phy.vanderbilt.edu/lasdamas/). We
use mock galaxy catalogs extracted from the Carmen boxes,
which are 40 dark matter-only low-resolution runs done with
$1120^3$ particles in a periodic cube with 1 $h^{-1}$ Gpc on a side.
In this way, we can get a simple estimate of the expected rms
deviations from our fiducial MultiDark result due to random
fluctuations in the initial conditions of the universe. The dark
matter density and scalar spectral index of the Carmen sim- 
ulations display a difference of about 8% in comparison to 
the corresponding values of MultiDark. However, since here 
we only want to obtain an estimate for the magnitude of the 
cosmic variance, we consider this approach as good enough 
for this purpose. 

As already mentioned in Section 2.2, to estimate the covariance
matrices of observed correlation functions we use a
set of 600 galaxy mocks designed to follow the same geometry
and redshift distribution of the CMASS sample, while
mimicking its clustering properties. Manera et al. (2012)
show that the covariances for the correlation functions of
N-body simulations are consistent with those resulting from
the mocks. This means that it is safe to compare the cosmic
variance of MultiDark (estimated from the Carmen set
of simulations) with that resulting from the mock galaxy
catalogs.



4 CLUSTERING OF GALAXIES IN THE 
BOSS-CMASS SAMPLE 

The two-dimensional correlation function $\xi(\sigma, \pi)$ for the
north subsample of BOSS-CMASS is presented in Fig. 5
for distances up to ~ 20 $h^{-1}$ Mpc (dashed contours). The
Finger-of-God elongation along the line-of-sight direction
at small perpendicular separations, which is due to galaxy
small-scale random velocities, is clearly seen. The flattening
of contours at larger projected scales is due to the Kaiser effect
caused by large-scale infall velocities (Kaiser 1987). Predictions
for the clustering of galaxies obtained from the MultiDark
cosmological simulation (solid contours) produce a
Figure 6. Left panel: Projected correlation function for the $0.4 < z < 0.7$ DR9 BOSS-CMASS north, south and full galaxy samples
(open blue triangles, open red circles and filled black circles respectively) and the MultiDark catalog selected with the HAM procedure
at $z = 0.53$ (solid line). The shaded area for MultiDark gives an estimate of the cosmic variance. BOSS-CMASS error bars were
estimated using an ensemble of 600 galaxy mocks (see Section 2.2). For clarity, only error bars for the combined sample are shown. The
corresponding ones for the north and south are a factor of about 1.13 and 2.15 times larger respectively. The transition between the
one-halo and two-halo terms can be seen at ~ 1 $h^{-1}$ Mpc. Flattening of the signal at intermediate scales and bending at large scales are
also evident features. Right panel: Detailed differences between the $\Lambda$CDM model and BOSS clustering are better seen when plotting the
quantity $\Xi(\sigma)\,\sigma$.




Figure 7. Left panel: Redshift-space correlation function for the 0.4 < z < 0.7 DR9 BOSS-CMASS north, south and Full galaxy samples 
(open blue triangles, open red circles and filled black circles respectively) and the MultiDark catalog selected with the HAM procedure 
at $z = 0.53$ (solid line). Standard deviations for model and observations are shown in the same way as in Fig. 6. Right panel: Shown is
the quantity §(s) which better reflects the differences between our ACDM model and BOSS clustering measures. 



fair representation of the measured clustering in the CMASS 
sample. Nevertheless, there are some deviations. At small 
separations, $\sigma < 2\,h^{-1}$ Mpc, observations show more clustering
as compared with results from the simulation. The
situation reverses at large scales, where our cosmological 
simulation predicts slightly stronger clustering. 

These tendencies are clearly seen in the correlation functions
presented in Figs. 6 and 7. The north, south and combined
CMASS samples are shown together with the result
of our simple HAM model. The shaded area for MultiDark
gives an estimate of the cosmic variance as computed from
the LasDamas suite of simulations. Again, the overall agreement
at all scales is quite good, showing a remarkable match with
observations. However, as noted before, there are some noticeable
discrepancies at small and intermediate scales. The
detailed differences between the projected correlation function
and MultiDark can be better seen in the right panel
of Fig. 6, where differences in the correlations are amplified
after multiplying by the corresponding projected distance.
The disagreement at scales < 1 $h^{-1}$ Mpc is perhaps
Figure 8. The mean occupancy of all haloes in our MultiDark
sample used to match the BOSS-CMASS observations as a function
of halo mass (open squares). Open circles and dashed line
correspond to satellites and central haloes respectively. Error bars
are calculated assuming Poisson statistics in the counting. The fit
given by Eq. (8) is shown as a dot-dashed line.



related to the simple stochastic HAM adopted here. At large
scales, starting from ~ 20 $h^{-1}$ Mpc, the theoretical estimates
lie slightly above the observational estimates of the northern
galaxy subsample (at the ~ $1\sigma$ level), which has about four
times larger statistics than the corresponding southern sample.

The redshift-space clustering results, both for the
CMASS sample and the $\Lambda$CDM model given by the MultiDark
simulation, are shown in Fig. 7. As before, the shaded
area represents cosmic variance estimates and differences between
model and observations are better seen in the right
panel. Peculiar velocities of galaxies inside virialized systems
reduce the clustering signal, thus lowering the slope of the
correlation function at scales of 1-2 $h^{-1}$ Mpc.

For scales in the range ~ 0.6-1 $h^{-1}$ Mpc our model
underpredicts the observed values, as already shown for
the case of the projected correlation function. The agreement
between the simulation measurement and the observed
redshift-space correlation function is quite remarkable at
scales > 1 $h^{-1}$ Mpc. Differences are less than 2-3% for a wide
range of distances, from 2 $h^{-1}$ Mpc to 20 $h^{-1}$ Mpc.
At 20-40 $h^{-1}$ Mpc the MultiDark results overpredict the observed
clustering by about 10%. Statistically the differences
are significant: the effect is about $3\sigma$ at ~ 30 $h^{-1}$ Mpc
(e.g., at $s = 33.5\,h^{-1}$ Mpc the redshift-space correlation
function for the combined CMASS sample and MultiDark
give $\xi_{\rm N+S}(s) = 0.077 \pm 0.004$ and $\xi_{\rm MD}(s) = 0.091 \pm 0.003$,
respectively). The small differences between the N-body results
and observations may be alleviated if we use a more
sophisticated HAM procedure including, for instance, light-cone
effects and a match to the stellar mass distribution
at these redshifts. Nevertheless, the high level of agreement
found between model and observations using the simple HAM
procedure adopted here is a striking result.



Figure 9. MultiDark HOD parameters, $M_{\rm cut}$ and $M_1$, as a
function of number density (solid line) using the simple HAM
prescription at $z = 0.53$. We compare our results with a variety
of intermediate redshift massive galaxy samples. The data
are taken from Phleps et al. (2006), Mandelbaum et al. (2006),
Kulkarni et al. (2007), Blake et al. (2008), Brown et al. (2008),
Padmanabhan et al. (2009), Wake et al. (2008) and Zheng et al.
Filled circles show results from the White et al. (2011) HOD
analysis of early BOSS data (see text).



5 THE MEAN HALO OCCUPANCY OF 
BOSS-CMASS GALAXIES 

Our analysis also allows us to study the halo occupation distribution
and the satellite fraction of BOSS-CMASS galaxies
at $z \approx 0.5$. The main advantage of the MultiDark simulation
is that it has sufficient resolution to resolve satellites
around central distinct haloes. The satellite distribution
around massive haloes can be directly studied from the
resulting halo catalogs. As shown previously in the top panel
of Fig. 4, the fraction of satellites for haloes with a number
density close to that of the CMASS sample is less than 10%.
In particular, for haloes having $V_{\rm max} > 362$ km s$^{-1}$, which
corresponds to a number density of $3.6 \times 10^{-4}\,h^{3}$ Mpc$^{-3}$, the
resulting satellite fraction is 6.8%. The HOD modeling by
White et al. (2011), using the first semester of BOSS data,
reported a satellite fraction of (10 ± 2)%, which is reduced to
(7 ± 2)% when they ignore in their fit to the correlation function
the very small scales affected by fibre collisions. Note
that our HAM procedure is non-parametric and provides
satellite fractions consistent with our $\Lambda$CDM cosmological
simulation. In contrast, the halo-occupancy distribution and satellite
fractions from HOD modeling are obtained from a fit to
the empirical correlation function.

Fig. 8 shows the mean occupancy of haloes for the
BOSS-CMASS sample as obtained from the MultiDark halo
abundance-matching scheme. The dashed line and open circles
are the contributions of distinct haloes and subhaloes respectively.
Open squares correspond to the total occupancy
of haloes, including both central and satellite galaxies from
our halo catalog. Distinct haloes display a sharp transition
around A/vir > lO" Mq. The mean number of satellite 
Figure 10. Scale-dependent galaxy bias at $z = 0.53$ predicted
for BOSS-CMASS galaxies using the MultiDark simulation. The
solid curve shows the bias relative to the dark matter (Eq. (9)).
The bias relative to the linear-theory predictions is shown as a
dashed line.



galaxies as a function of halo mass can be modeled with the
following expression (e.g., Wetzel & White 2010)

$$N_{\rm sat}(M) = \left(\frac{M}{M_1}\right)^{\alpha} \exp\left(-\frac{M_{\rm cut}}{M}\right), \qquad (8)$$



where $\log M_{\rm cut} = 13.07 \pm 0.40$, $\log M_1 = 14.25 \pm 0.17$ and
$\alpha = 0.94 \pm 0.42$ (dot-dashed line). Here, $M_1$ is the halo
mass which hosts, approximately, one satellite and $M_{\rm cut}$ governs
the strength of the transition between systems with and
without satellites. For high halo masses, fluctuations
in the determination of the satellite occupancy arise because
we are dealing with small number statistics as a result of the
fixed volume of the simulation. The solid line in Fig. 8 shows
the total mean halo occupancy but using in this case the best
fit model for the satellite distribution in order to extrapolate
the result towards higher masses.
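
For reference, the satellite occupation fit can be evaluated with a few lines of Python. The sketch assumes the functional form $N_{\rm sat}(M) = (M/M_1)^{\alpha}\exp(-M_{\rm cut}/M)$ commonly used in the HOD literature, with the best-fit central values quoted above as defaults and halo masses given in the same units as $M_{\rm cut}$ and $M_1$; the function name is hypothetical.

```python
import numpy as np

def mean_nsat(mass, log_mcut=13.07, log_m1=14.25, alpha=0.94):
    """Mean number of satellites in a halo of the given mass.

    mass     : halo mass (scalar or array), same units as M_cut and M_1
    log_mcut : log10 of the cut-off mass M_cut
    log_m1   : log10 of the mass M_1 hosting roughly one satellite
    alpha    : power-law slope at high mass
    """
    mass = np.asarray(mass, dtype=float)
    # Power law in M / M_1 with an exponential cut-off below M_cut.
    return (mass / 10.0**log_m1) ** alpha * np.exp(-(10.0**log_mcut) / mass)
```

With these parameters the occupation at $M = M_1$ is slightly below one satellite, because the exponential cut-off is not yet negligible at that mass.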

In Fig. 9 we compare the HOD parameters, $M_{\rm cut}$ and
$M_1$, obtained from MultiDark at $z = 0.53$ as a function of
number density (solid lines) following our HAM scheme. We
also show estimates for a variety of intermediate redshift
massive galaxy samples from the literature, including the
HOD results from White et al. (2011) for the early BOSS
data sample. This compilation of different datasets has been
kindly provided by M. White. Error bars on the individual
points are typically ~ 0.1 dex, as represented by the size of
the symbols. The agreement between the MultiDark HAM
predictions and data from different surveys is remarkable
if one considers the differences in sample selection, redshift
range and HOD methods. Our estimates for the HOD parameters
of the BOSS-CMASS sample are consistent
with those of the White et al. HOD analysis, which are contained
within our error bars. Nevertheless, it is worth noting
here that White et al. (2011) did not consider weights in the
estimation of the correlation function used, which could have
an impact on the derived parameters, and that our approach
relies completely on our halo catalog.



6 POWER SPECTRUM AND BIASES 

In this section we focus on the halo catalog abundance-matched
to the BOSS-CMASS galaxy sample, presenting further
predictions from the MultiDark simulation that can
be tested with future observations. Using the resulting halo
sample and dark matter particles from the simulation we
can estimate the bias of the halo population with respect to
the underlying mass distribution as follows



$$b(r) = \sqrt{\frac{\xi_{\rm h}(r)}{\xi_{\rm m}(r)}}, \qquad (9)$$



where $\xi_{\rm h}(r)$ and $\xi_{\rm m}(r)$ are the real-space correlation functions
for the MultiDark haloes and dark matter in the volume
at the redshift of interest. This bias is shown in Fig. 10
as a function of spatial scale (solid line). The resulting bias
is $b \approx 2$ at the transition scale of ~ 1 $h^{-1}$ Mpc and, as expected,
increases strongly for smaller scales where galaxies
are more strongly clustered with respect to the dark matter.
For the remaining scales we can constrain the bias factor to
be in the range $b \approx 1.8$-$2.2$. Interestingly, this result is in
very good agreement with the findings of Ho et al. (2012).
These authors found a galaxy bias of $b = 1.98 \pm 0.05$ in the
redshift range $z = 0.50$-$0.55$ by studying the angular clustering
of the photometric CMASS sample. The bump-like
feature between ~ 1-10 $h^{-1}$ Mpc is related to the transition
between the one- and two-halo terms in the correlation
function, while for larger scales the bias factor tends to decrease.
The linear bias estimation is shown as a dashed line,
where the linear matter correlation function is used instead.
As expected, the linear bias at small scales differs strongly
from the non-linear result while approaching more similar
values at larger scales.

We have a number of goals with the analysis of the 
power spectrum and biases: (1) We want to present accurate 
approximations for the numerical results, which can be used 
for comparison of observational results with predictions of 
the cosmological model used in our simulations. It is more 
convenient to use these approximations instead of having to 
deal with raw simulations. (2) The high quality of our results 
allows us to study effects which are difficult to measure with 
low-resolution simulations. 

One should clearly understand the role of the standard
$\Lambda$CDM model with the particular set of cosmological parameters
used for our simulations. Our results show that, once
we match the abundance of haloes, the model reasonably 
reproduces a wide range of scales of the observed projected 
and redshift-space correlation functions. In principle, one 
can invert the correlation function to obtain the power spec- 
trum. However, in practice, a model-independent inversion 
is a technically complicated process. This is why we chose a 
different approach: we use the power spectrum of haloes in 
the model as a proxy of the power spectrum of galaxies in 
BOSS. 

We use two other sets of simulations in addition to 
MultiDark. The first one is the already mentioned Carmen 
series of 40 simulations of the LasDamas suite of simula- 
tions that allow us to estimate the effect of cosmic vari- 
ance. These mock galaxy catalogs are produced with an 
HOD model with parameters aimed at fitting the respective
SDSS galaxy samples (McBride et al. 2009). Note that,
Figure 11. Left panel: Recovering the power spectrum: shot-noise and density assignment corrections. The top solid thin curve shows
the "raw" estimate of the power spectrum at $z = 0.53$ for haloes and subhaloes with circular velocities $V_{\rm max} > 362$ km s$^{-1}$,
corresponding to a number density close to that of galaxies in the BOSS sample, $n = 3.6 \times 10^{-4}\,h^{3}$ Mpc$^{-3}$. The dot-dashed line is the
combined correction in Eq. (10) due to the shot-noise and the density assignment. The vertical line shows the Nyquist frequency. The
thick solid line is the recovered power spectrum. The dashed line shows the linear power spectrum of dark matter density perturbations
scaled up to match the amplitude of the recovered power spectrum at long waves. Right panel: Comparison between the recovered power
spectra for haloes+subhaloes with $V_{\rm max} > 200$ km s$^{-1}$ in the MultiDark (solid line) and the Bolshoi (dashed line) simulations at $z = 0$.
Deviations at $k < 0.1\,h$ Mpc$^{-1}$ are due to cosmic variance. The deviations at $k > 5\,h$ Mpc$^{-1}$ are due to density assignment effects in
the MultiDark simulation. However, for wave-numbers in the range $0.2\,h$ Mpc$^{-1} < k < 5\,h$ Mpc$^{-1}$ the resulting power spectra are not
affected by cosmic variance and resolution and the agreement between simulations is excellent, with deviations of less than a few percent.



as before, we use only relative model-to-model deviations in
the Carmen simulations: error bars in our results are obtained
in this way. Secondly, we also use results of the Bolshoi
simulation (Klypin et al. 2011). This simulation has a
factor of ~ 5 better mass and force resolution, but it was
performed for a smaller simulation box (250 $h^{-1}$ Mpc on a
side). There is an overlap between the MultiDark and Bolshoi
simulations: the simulation volume of Bolshoi is large
enough to study haloes (and subhaloes) with circular velocities
of ~ 200 km s$^{-1}$. At the same time, these (sub)haloes
are reasonably well resolved in the MultiDark simulation,
having more than 100 particles. Comparison of MultiDark
and Bolshoi power spectra for these haloes allows us to look
for biases at scales $k > 0.1\,h$ Mpc$^{-1}$.

To estimate power spectra, we use a large density mesh
of $4096^3$ cells and then we apply the standard FFT method.
The Cloud-In-Cell density assignment scheme is used to calculate
the density fields from the coordinates of haloes in the
simulations. However, before the power spectra can be reliably
used two corrections should be applied: a correction
due to the density assignment (Jing 2005) and the usual
shot-noise correction. If the number density of objects is
$n = N/L^3$ and the Nyquist wave-number is $k_{\rm Ny} = \pi N_{\rm grid}/L$,
then the corrected power spectrum is given by



$$P(k) = P_{\rm raw}(k) - \ldots \qquad (10)$$


where $L$ is the length of the computational box and $N_{\rm grid} = 4096$.
This approximation is known to work well for $k$ <






Figure 12. Power spectra (multiplied by fc^'^) of dark matter 
haloes in real space (open circles with error bars) for haloes with 
$V_{\rm max} > 362$ km s$^{-1}$ (top) and $V_{\rm max} > 180$ km s$^{-1}$ (bottom).
Solid curves show the linear power spectra scaled to match the 
amplitude of fluctuations at long waves. The four vertical lines 
indicate the positions of maxima due to BAO. The BAO peaks 
in the linear spectrum give rise to peaks in the power spectrum 
of haloes. 
Figure 13. Bottom panel: Real-space bias factor $b(k) = (P_{\rm gg}/P_{\rm linear})^{1/2}$
for haloes with circular velocities $V_{\rm max} = 180, 200, 220, 250, 300$
and 362 km s$^{-1}$ (from bottom to top). Top
panel: Bias factor for different haloes normalized to unity at long
waves. The bias factor $b(k)$ depends on the circular velocity $V_{\rm max}$
and on wave-number $k$ in a rather complicated way. There are
small depressions in the bias factor at peaks of BAOs. When
normalized to the long-wave value $b_0$, the bias factor is slightly
smaller for less massive haloes. However, the main effect is the
overall shift $b_0$.



0.7 $k_{\rm Ny}$ (Jing 2005; Cui et al. 2008). However, to remain on
safe ground we decided to limit our analysis to $k < 0.4\,k_{\rm Ny} \approx 5\,h$ Mpc$^{-1}$.
The left panel of Fig. 11 illustrates the procedure
of shot-noise and density corrections using a halo sample
with $V_{\rm max} > 362$ km s$^{-1}$ extracted from the MultiDark simulation
at $z = 0.53$.
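
The quoted limit can be checked with a short calculation of the Nyquist wave-number $k_{\rm Ny} = \pi N_{\rm grid}/L$. The sketch below assumes a box side of $L = 1000\,h^{-1}$ Mpc for MultiDark, a value that is not stated in this section and is adopted here only for illustration; the function name is likewise hypothetical.

```python
import math

def nyquist_wavenumber(n_grid, box_size):
    """Nyquist wave-number [h Mpc^-1] of a mesh with n_grid cells per side."""
    return math.pi * n_grid / box_size

# Assumed MultiDark box side in h^-1 Mpc (illustrative value, see lead-in).
BOX_SIZE = 1000.0

k_ny = nyquist_wavenumber(4096, BOX_SIZE)  # about 12.9 h Mpc^-1
k_max = 0.4 * k_ny                         # conservative limit, about 5 h Mpc^-1
print(f"k_Ny = {k_ny:.2f} h/Mpc, 0.4 k_Ny = {k_max:.2f} h/Mpc")
```

Under this assumption the conservative cut at 40% of the Nyquist wave-number lands close to the 5 $h$ Mpc$^{-1}$ limit adopted in the analysis.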

In the right panel of Fig. 11 we compare results of the
MultiDark and Bolshoi simulations. Just as one may expect,
there are some deviations at long waves due to the
cosmic variance: the Bolshoi box of 250 $h^{-1}$ Mpc is too small
to accurately probe these waves. There are also deviations
at short waves, corresponding to $k > 7\,h$ Mpc$^{-1}$, that are
mainly due to the difference in density assignment between
both simulations. For the Bolshoi simulation, the adopted
mesh sets a minimum physical scale four times higher in
frequency in comparison to MultiDark. However, for wave-numbers
in the range $0.2\,h$ Mpc$^{-1} < k < 5\,h$ Mpc$^{-1}$ the
agreement between the simulations is remarkably good. This
agreement is especially important for short waves, where
both resolution and shot-noise could have corrupted the results.
Since this has not happened, the
obtained power spectrum for MultiDark can be trusted
up to, at least, $k = 5\,h$ Mpc$^{-1}$.

Fig. 12 shows power spectra of haloes with circular velocity
cuts $V_{\rm max} > 362$ km s$^{-1}$ (top curves) and $V_{\rm max} > 180$ km s$^{-1}$
(bottom curves). To highlight BAO features,
we actually plot the power spectra of the halo distribution 
multiplied by fc'^'^. As a result, the first four peaks in the 
spectra are clearly seen in the plot. However, they are somewhat
smeared out by the non-linear evolution. As expected,
Figure 14. Real-space bias factor for haloes with circular velocities
larger than $V_{\rm max} = 362$ km s$^{-1}$. Top panel: The bias factor
normalized to the long-wave value $b_0$ (bottom panel, solid line)
is compared with the analytical approximation given by Eq. (15)
(dashed line). The top panel displays the relative error in percentages
of the analytical approximation (filled circles). Error bars
show the rms fluctuations due to the cosmic variance. Bottom
panel: Deviations of the bias from the "de-wiggled" component
of the bias factor given by Eq. (15). Open circles show the relative
deviations $b(k)/b_{\rm no-wiggle} - 1$ for each wave-number. The solid
line is an analytical model for the residuals: the sum of exponential
terms in Eq. (15). Error bars show the rms fluctuations due
to cosmic variance.
the smearing increases for larger wave-numbers where the
non-linearity is more important.

In what follows, we define the bias factor by

$$b(k, V_{\rm max}) = \left[\frac{P_{\rm gg}(k, V_{\rm max})}{P_{\rm linear}(k)}\right]^{1/2}, \qquad (11)$$



where $P_{\rm linear}(k)$ is the linear power spectrum of the dark
matter and $P_{\rm gg}(k, V_{\rm max})$ is the power spectrum of haloes
and subhaloes with circular velocities larger than $V_{\rm max}$. In
order to distinguish the latter from the often used non-linear
dark matter power spectrum or from the power spectrum of
distinct haloes, we use the subscript "gg" to indicate that our
results mimic galaxies.

We start our analysis with the long-wave normalization
of the bias parameter for different velocity cuts and, thus,
for different number densities of our "galaxies". The bottom
panel in Fig. 13 shows $b(k, V_{\rm max})$ for different velocities. At
all wave-numbers the bias $b(k, V_{\rm max})$ increases with increasing
$V_{\rm max}$. The top panel shows that when normalized to
the long-wave value, $b_0(V_{\rm max})$, the bias factor is nearly the
same. However, there is some residual dependence on $V_{\rm max}$,
i.e., the deviations of the bias from one velocity cut to another
can be as large as 15% and this should be taken into
account if an accurate fit is needed. An approximation for
the real-space long-wave bias factor as a function of the average
number density of dark matter haloes $n(> V_{\rm max})$ or
$V_{\rm max}$ is presented below:



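$$b_0(n) = 1 + 0.57 \log_{10}\left[\frac{2.05 \times 10^{-2}\,h^{3}\,{\rm Mpc}^{-3}}{n}\right], \qquad (12)$$

$$b_0(V_{\rm max}) = 1 + \left[\frac{V_{\rm max}}{361\ {\rm km\,s^{-1}}}\right]^{4/3}. \qquad (13)$$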
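
The two long-wave bias fits can be cross-checked numerically. The Python sketch below takes Eq. (12) in the form $b_0(n) = 1 + 0.57\log_{10}(2.05 \times 10^{-2}\,h^{3}\,{\rm Mpc}^{-3}/n)$, which is equivalent to the expression $b = 0.0377 - 0.57\log_{10}(n/h^{3}\,{\rm Mpc}^{-3})$ quoted in the abstract, and Eq. (13) in the form $b_0 = 1 + (V_{\rm max}/361\ {\rm km\,s^{-1}})^{4/3}$; the function names are hypothetical.

```python
import math

def b0_from_density(n):
    """Long-wave real-space bias as a function of number density n [h^3 Mpc^-3]."""
    return 1.0 + 0.57 * math.log10(2.05e-2 / n)

def b0_from_vmax(vmax):
    """Long-wave real-space bias as a function of the velocity cut Vmax [km/s]."""
    return 1.0 + (vmax / 361.0) ** (4.0 / 3.0)

# BOSS-CMASS-like sample: n = 3.6e-4 h^3 Mpc^-3, i.e. Vmax > 362 km/s.
print(round(b0_from_density(3.6e-4), 2), round(b0_from_vmax(362.0), 2))
```

Both expressions give a bias close to 2 for the BOSS-CMASS abundance, in line with the long-wave value of about 2.01 measured from the simulation.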



We now focus our analysis on the bias factor of haloes
with $V_{\rm max} > 362$ km s$^{-1}$ at $z = 0.53$, whose abundance
$n = 3.6 \times 10^{-4}\,h^{3}$ Mpc$^{-3}$ matches that of BOSS galaxies.
The top panel of Fig. 14 shows the bias factor of these
haloes normalized to the value $b_0(V_{\rm max}) \approx 2.01$ found at long
waves. Overall the bias factor is nearly flat at long waves
and monotonically increases towards short waves. The following
approximation for the smooth component of the real-space
bias factor gives an accuracy better than 4%:

b{k, V„,ax) = 6o(Vmax) [l + logio(l + Sk^ "" + 5.8k^)] , (14) 

where the wave-number $k$ is in units of $h$ Mpc$^{-1}$. However,
this approximation misses an important effect of
non-linearities, the damping of the BAO. The coupling
between different Fourier modes washes out the
acoustic oscillations, erasing the higher harmonic peaks
(Meiksin et al. 1999; Eisenstein et al. 2007b; Angulo et al.
2005, 2008; Sanchez et al. 2008; Montesano et al. 2010).

In recent years, there has been substantial progress in
the theoretical understanding of non-linear distortions in
the BAO signal, which can now be accurately modelled
(see e.g. Crocce & Scoccimarro 2006, 2008; Matsubara 2008a;
Taruya et al. 2009) and even partially corrected
for (Eisenstein et al. 2007a; Seo et al. 2010). As the bias factor
in Eq. (11) is defined with respect to the extrapolated
linear theory power spectrum, this damping leads to small
wiggles in b(k) with an amplitude at the 2-4% level, which
can be better seen in the bottom panel of Fig. 14.



Table 1. Parameters for the approximation of the real-space bias
factor given by Eq. (15).

BAO peak   k_i (h Mpc^-1)   α_i     σ_i
1          0.071            0.010   0.017
2          0.130            0.043   0.017
3          0.191            0.022   0.017
4          0.251            0.013   0.012



The accuracy of the fit to b(k, V_max) for V_max >
362 km s^-1 can be improved by including extra terms in
the expansion and by adding the four main BAO peaks as
follows:



b{k, Vmax) 



bo(Vmax) X 

[1 + logio(l -f- 4.0fc'-* + 3.1k^ + l.Ofc'''^)] X 



n 



Oi exp 



(15) 



Here each BAO peak is approximated as a small suppression
of the bias factor, given by the last term of the equation;
k_i is the wave-number of the peak, and α_i ~ 0.01-0.05
and σ_i ~ 0.01-0.02 are free parameters. The typical errors
of this approximation are smaller than 2% (see the
top panel of Fig. 14). The values of the parameters used in
the approximation are listed in Table 1.
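To illustrate the size of the BAO term, the sketch below evaluates a product of suppression factors using the four sets of values from Table 1. The exact functional form of the last term of Eq. (15) is not legible here, so the code assumes a Gaussian dip, 1 - α_i exp[-(k - k_i)^2 / (2 σ_i^2)], for each peak. That form, the column order (k_i, α_i, σ_i) and all names are assumptions made for illustration only.

```python
import math

# (k_i [h Mpc^-1], alpha_i, sigma_i) as read from Table 1; column order assumed.
BAO_PEAKS = [
    (0.071, 0.010, 0.017),
    (0.130, 0.043, 0.017),
    (0.191, 0.022, 0.017),
    (0.251, 0.013, 0.012),
]


def bao_suppression(k, peaks=BAO_PEAKS):
    """Product of assumed Gaussian dips, one per BAO peak, at wave-number k."""
    factor = 1.0
    for k_i, alpha_i, sigma_i in peaks:
        dip = alpha_i * math.exp(-((k - k_i) ** 2) / (2.0 * sigma_i ** 2))
        factor *= 1.0 - dip
    return factor


at_second_peak = bao_suppression(0.130)  # centre of the deepest dip
far_from_peaks = bao_suppression(0.400)  # well beyond the fourth peak
```

Under this assumption the deepest dip, at the second peak, is about 4%, in line with the 2-4% amplitude quoted in the text, and the factor returns to unity away from the peaks.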

Using Eq. (13) and the bias factor b(k) for the velocity
cut V_max ≈ 362 km s^-1, we develop corrections to the
bias factor for different values of V_max. In this way, we find
the following set of equations, which is accurate to better
than 4% over the range V_max = 180-370 km s^-1:



b(k, Vmax) 



&o(Vmax) X 

[l + logio(l + 4.0fc'-* -I- 3.1fc^ + l.Ofc*'^)] X 



n 



ai exp 



(k-hf 



(16) 



[l-/3o(l-e-'='/°-''')+/3ifc-/32fc^] 
where the parameters β_0, β_1 and β_2 depend only on V_max:
66.6 km s^^ 



P2 



2.18 X 10" 



= 1.64 X 10" 



205.8 km s" 

Vmax 

266.5 km s" 

Vmax 



103/14 



(17) 



1\ 1/6' 



7 CONCLUSIONS 

We presented an analysis of the clustering of 282,068 galaxies
in the DR9 sample of BOSS data over a wide range of
scales, from ~500 h^-1 kpc to ~90 h^-1 Mpc. We
separately studied the clustering in the northern and southern
hemispheres, as well as for the full-sky sample. We measured
the two-dimensional, projected and redshift-space correlation
functions and compared the results with those obtained
from a large cosmological simulation with 1 h^-1 Gpc
on a side at a redshift of z = 0.53. The cosmological parameters
adopted in the simulation are consistent with the latest
WMAP7 results and several other probes. Our simulation,
also known as MultiDark, is able to resolve the relevant subhalo
masses needed to compare with the observed satellite
population. To bridge the gap between galaxies and dark
matter haloes we use a simple HAM technique applied to
the BOSS-CMASS sample. Our main results can be summarized
as follows:

• There is a 10-20% asymmetry in the projected and
redshift-space correlation functions between the north
and south subsamples at scales > 20 h^-1 Mpc, which
is better seen in the projected correlation
function. However, the mean values of the two subsamples
agree with each other within a ~1σ level of uncertainty.

• Compared with the first-semester BOSS results
presented by White et al. (2011), we find a small increase in
power in the projected correlation function at scales smaller
than ~1 h^-1 Mpc due to the improved treatment of fibre
collisions and new corrections for systematics. However, the
correlation functions (projected and redshift-space) decline
by 10-20% at 10-30 h^-1 Mpc scales in comparison with our
HAM model. This is most noticeable for the north subsample,
which has about four times larger statistics than its
southern counterpart. The south subsample
is more consistent with MultiDark at all
scales, in both the projected and redshift-space correlations.

• Our N-body predictions for the clustering of galaxies
give a fair representation of the measured clustering in the
CMASS sample over a wide range of scales. The more consistent
results between the north and south subsamples for
the redshift-space correlation function show a remarkable
agreement with theory: the differences are of the order of
~2% on scales from 2 h^-1 Mpc up to 20 h^-1 Mpc.
This result is more impressive considering
that our simple HAM scheme does not include any free
parameter. At larger distances, however, we find some
deviations when comparing with the north subsample.
For scales in the range 20-40 h^-1 Mpc the theoretical
redshift-space correlation function is above the observations
by ~10%. Statistically, this difference is important: e.g.,
it represents a ~3σ deviation at ~30 h^-1 Mpc. Future
data and a more sophisticated theoretical modeling may
help to clarify the situation.

• The distribution of (sub)haloes as a function of halo
mass, as measured from our abundance-matched halo
catalog, points towards a BOSS-CMASS galaxy population
inhabiting haloes of mass M > 10^13 h^-1 M_⊙, with ~7% of
them being satellites orbiting centrals with M > 10^14 h^-1
M_⊙. We also derived values for the HOD parameters of the
sample using our simulation: log M_cut = 13.07 ± 0.40 and
log M_1 = 14.25 ± 0.17.

• The scale-dependent galaxy bias of BOSS galaxies
is likely to be b ≈ 2 at scales > 10 h^-1 Mpc (see
Eq. (O)). Furthermore, using our simulation, we also
computed a large-scale bias (defined through the ratio between the
abundance-matched galaxy catalog and the extrapolated
linear matter power spectra; see Eq. (11)) and found that
it depends on the galaxy maximum circular velocity as
b(V_max) = 1 + (V_max/361 km s^-1)^{4/3}, or on the galaxy number
density as b(n_g) = 0.0377 - 0.57 log_10(n_g / h^3 Mpc^-3).
These approximations can be used to compare observational
results with predictions for the cosmology adopted in our
simulation.

• The large-scale galaxy bias, defined using Eq. (11), has
~2-4% dips at the positions of the BAO peaks in the spectrum
of fluctuations, which are due to shifts caused by non-linear
effects. In this case, we also provide very accurate fits of the
bias as a function of the maximum circular velocity of galaxies,
which can also be used to recover the non-linear galaxy power
spectrum in terms of the extrapolated linear density field of
matter.
Figure A1. Comparison of the real-space correlation functions of
(sub)haloes identified with the BDM and RockStar halo finders at
z = in the MultiDark simulation. Left panels are for (sub)haloes
with maximum circular velocity V_max > 350 km s^-1, while right
panels are for V_max > 300 km s^-1. Top panels present ratios of
the correlation functions. Solid (dashed) lines in the bottom panels
show the BDM (RockStar) correlation function multiplied by
the square of the radius.

Figure A2. Redshift-space correlation function for different
number densities of our MultiDark halo catalogs (including scatter),
as indicated in the plot (see text). We compare these results
with the DR9 BOSS-CMASS north and south galaxy samples in
the redshift range 0.4 < z < 0.7. For clarity the error bars are
not shown.



APPENDIX A: DEPENDENCE OF
CLUSTERING ON DIFFERENT EFFECTS

The dependence of clustering on the halo finder used to
identify virialized systems in the simulation is shown in
Fig. A1 for the BDM (Klypin & Holtzman 1997; Riebe et al.
2011) and RockStar (Behroozi et al. 2011) codes. As an example
we select all (sub)haloes present in the MultiDark simulation
at z = with V_max > 300 km s^-1 and V_max > 350
km s^-1 in order to compute the real-space correlation function
of the resulting halo catalogs. As can be seen in the
figure, the agreement between both halo finders is remarkable;
at small (< 1 h^-1 Mpc) and large (> 70 h^-1 Mpc)
scales the difference in power is typically ~10%.

To assess the dependence of clustering on number density,
we evaluate different redshift-space correlation functions
and compare them with observations (see Fig. A2). We compute
three different correlation functions using the MultiDark
halo catalogs at z = 0.53, assuming the stochasticity model
presented in Section 3.2. In this way, we get the following
number densities: n_g = [1.8, 3.6, 7.2] × 10^-4 h^3
Mpc^-3. The dashed line corresponds to our effective number
density, i.e. 3.6 × 10^-4 h^3 Mpc^-3. As expected, doubling
and halving this value gives a weaker and a stronger clustering
signal, respectively. In these extreme cases, the departure
from observations is typically above the observational
uncertainties. This result reflects the importance of correctly
matching the abundance of haloes to the one in observations.
However, typical departures from the effective number density
adopted in this work are smaller than 5% and do not
appreciably change our final result.